 expert strategy


Combining model tracing and constraint-based modeling for multistep strategy diagnoses

van der Hoek, Gerben, Jeuring, Johan, Bos, Rogier

arXiv.org Artificial Intelligence

Model tracing and constraint-based modeling are two approaches to diagnosing student input in stepwise tasks. Model tracing supports identifying consecutive problem-solving steps taken by a student, whereas constraint-based modeling supports student input diagnosis even when several steps are combined into one step. We propose an approach that merges both paradigms. By defining constraints as properties that a student input has in common with a step of a strategy, it is possible to provide a diagnosis when a student deviates from a strategy even when the student combines several steps. In this study we explore the design of a system for multistep strategy diagnoses, and evaluate these diagnoses. As a proof of concept, we generate diagnoses for an existing dataset containing steps students take when solving quadratic equations (n=2136). To compare with human diagnoses, two teachers coded a random sample of deviations (n=70) and applications of the strategy (n=70). Results show that the system diagnosis aligned with the teacher coding in all of the 140 student steps.


FlowHFT: Imitation Learning via Flow Matching Policy for Optimal High-Frequency Trading under Diverse Market Conditions

Li, Yang, Chen, Zhi, Yang, Steve

arXiv.org Artificial Intelligence

High-frequency trading (HFT) is an investing strategy that continuously monitors market states and places bid and ask orders at millisecond speeds. Traditional HFT approaches fit models with historical data and assume that future market states follow similar patterns. This limits the effectiveness of any single model to the specific conditions it was trained for. Additionally, these models achieve optimal solutions only under specific market conditions, such as assumptions about the stock price's stochastic process, stable order flow, and the absence of sudden volatility. Real-world markets, however, are dynamic, diverse, and frequently volatile. To address these challenges, we propose FlowHFT, a novel imitation learning framework based on a flow matching policy. FlowHFT simultaneously learns strategies from numerous expert models, each proficient in particular market scenarios. As a result, our framework can adaptively adjust investment decisions according to the prevailing market state. Furthermore, FlowHFT incorporates a grid-search fine-tuning mechanism. This allows it to refine strategies and achieve superior performance even in complex or extreme market scenarios where expert strategies may be suboptimal. We test FlowHFT in multiple market environments. We first show that the flow matching policy is applicable in stochastic market environments, thus enabling FlowHFT to learn trading strategies under different market conditions. Notably, our single framework consistently achieves performance superior to the best expert for each market condition.


Random feature-based double Vovk-Azoury-Warmuth algorithm for online multi-kernel learning

Rokhlin, Dmitry B., Gurtovaya, Olga V.

arXiv.org Artificial Intelligence

We introduce a novel multi-kernel learning algorithm, VAW$^2$, for online least squares regression in reproducing kernel Hilbert spaces (RKHS). VAW$^2$ leverages random Fourier feature-based functional approximation and the Vovk-Azoury-Warmuth (VAW) method in a two-level procedure: VAW is used to construct expert strategies from random features generated for each kernel at the first level, and then again to combine their predictions at the second level. A theoretical analysis yields a regret bound of $O(T^{1/2}\ln T)$ in expectation with respect to artificial randomness, when the number of random features scales as $T^{1/2}$. Empirical results on some benchmark datasets demonstrate that VAW$^2$ achieves superior performance compared to the existing online multi-kernel learning algorithms Raker and OMKL-GF, and to other theoretically grounded methods involving a convex combination of expert predictions at the second level.
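The two-level procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian kernels, the standard VAW prediction rule (regularized least squares in which the current feature vector is added to the covariance matrix before predicting), and a Sherman-Morrison update to avoid a matrix library. The names `make_rff`, `VAWForecaster`, and `run_two_level`, and the parameters `bandwidth`, `reg`, and `seed`, are hypothetical.

```python
import math
import random


def make_rff(dim, n_features, bandwidth, seed=0):
    """Random Fourier feature map approximating a Gaussian kernel (assumed kernel choice)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
         for _ in range(n_features)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(W, b)]

    return features


class VAWForecaster:
    """Vovk-Azoury-Warmuth forecaster for online least squares on fixed-size features."""

    def __init__(self, n_features, reg=1.0):
        # Inverse of A = reg * I, kept up to date by rank-one updates.
        self.A_inv = [[(1.0 / reg if i == j else 0.0) for j in range(n_features)]
                      for i in range(n_features)]
        self.b = [0.0] * n_features  # running sum of y_s * z_s over past rounds

    def predict(self, z):
        # VAW adds the current features z z^T to A before predicting.
        # Sherman-Morrison: (A + z z^T)^-1 = A^-1 - (A^-1 z)(A^-1 z)^T / (1 + z^T A^-1 z)
        Az = [sum(a * zj for a, zj in zip(row, z)) for row in self.A_inv]
        denom = 1.0 + sum(zi * ai for zi, ai in zip(z, Az))
        n = len(z)
        for i in range(n):
            for j in range(n):
                self.A_inv[i][j] -= Az[i] * Az[j] / denom
        theta = [sum(a * bj for a, bj in zip(row, self.b)) for row in self.A_inv]
        return sum(t * zi for t, zi in zip(theta, z))

    def update(self, z, y):
        # Call once per round, after predict(z) for the same z.
        for i, zi in enumerate(z):
            self.b[i] += y * zi


def run_two_level(stream, feature_maps, n_features, reg=1.0):
    """Level 1: one VAW expert per kernel. Level 2: VAW over the expert predictions."""
    experts = [VAWForecaster(n_features, reg) for _ in feature_maps]
    combiner = VAWForecaster(len(feature_maps), reg)
    predictions = []
    for x, y in stream:
        zs = [fm(x) for fm in feature_maps]
        p = [e.predict(z) for e, z in zip(experts, zs)]
        predictions.append(combiner.predict(p))
        for e, z in zip(experts, zs):
            e.update(z, y)
        combiner.update(p, y)
    return predictions
```

A call such as `run_two_level(stream, [make_rff(2, 16, 0.5), make_rff(2, 16, 2.0, seed=1)], 16)` would combine two kernels with different bandwidths; the bandwidths and feature count here are arbitrary illustration values, not settings from the paper.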


RTBAgent: A LLM-based Agent System for Real-Time Bidding

Cai, Leng, He, Junxuan, Li, Yikai, Liang, Junjie, Lin, Yuanping, Quan, Ziming, Zeng, Yawen, Xu, Jin

arXiv.org Artificial Intelligence

Real-Time Bidding (RTB) enables advertisers to place competitive bids on impression opportunities instantaneously, striving for cost-effectiveness in a highly competitive landscape. Although RTB has widely benefited from the utilization of technologies such as deep learning and reinforcement learning, the reliability of related methods often encounters challenges due to the discrepancies between online and offline environments and the rapid fluctuations of online bidding. To handle these challenges, RTBAgent is proposed as the first RTB agent system based on large language models (LLMs), which synchronizes real competitive advertising bidding environments and obtains bidding prices through an integrated decision-making process. Specifically, obtaining reasoning ability through LLMs, RTBAgent is further tailored to be more professional for RTB via involved auxiliary modules, i.e., click-through rate estimation model, expert strategy knowledge, and daily reflection. In addition, we propose a two-step decision-making process and multi-memory retrieval mechanism, which enables RTBAgent to review historical decisions and transaction records and subsequently make decisions more adaptive to market changes in real-time bidding. Empirical testing with real advertising datasets demonstrates that RTBAgent significantly enhances profitability. The RTBAgent code will be publicly accessible at: https://github.com/CaiLeng/RTBAgent.


MOT: A Mixture of Actors Reinforcement Learning Method by Optimal Transport for Algorithmic Trading

Cheng, Xi, Zhang, Jinghao, Zeng, Yunan, Xue, Wenfang

arXiv.org Artificial Intelligence

Algorithmic trading refers to executing buy and sell orders for specific assets based on automatically identified trading opportunities. Strategies based on reinforcement learning (RL) have demonstrated remarkable capabilities in addressing algorithmic trading problems. However, trading patterns differ across market conditions because of distribution shift in the data. Ignoring multiple patterns in the data will undermine the performance of RL. In this paper, we propose MOT, which designs multiple actors with disentangled representation learning to model the different patterns of the market. Furthermore, we incorporate the Optimal Transport (OT) algorithm to allocate samples to the appropriate actor by introducing a regularization loss term. Additionally, we propose a Pretrain Module to facilitate imitation learning by aligning the outputs of actors with an expert strategy and to better balance the exploration and exploitation of RL. Experimental results on real futures market data demonstrate that MOT exhibits excellent profit capabilities while balancing risks.


On Strategy Stitching in Large Extensive Form Multiplayer Games

Neural Information Processing Systems

Computing a good strategy in a large extensive form game often demands an extraordinary amount of computer memory, necessitating the use of abstraction to reduce the game size. Typically, strategies from abstract games perform better in the real game as the granularity of abstraction is increased. This paper investigates two techniques for stitching a base strategy in a coarse abstraction of the full game tree, to expert strategies in fine abstractions of smaller subtrees. We provide a general framework for creating static experts, an approach that generalizes some previous strategy stitching efforts. In addition, we show that static experts can create strong agents for both 2-player and 3-player Leduc and Limit Texas Hold'em poker, and that a specific class of static experts can be preferred among a number of alternatives. Furthermore, we describe a poker agent that used static experts and won the 3-player events of the 2010 Annual Computer Poker Competition.


Can Large Language Models Replace Economic Choice Prediction Labs?

Shapira, Eilam, Madmon, Omer, Reichart, Roi, Tennenholtz, Moshe

arXiv.org Artificial Intelligence

Economic choice prediction is an essential and challenging task, often constrained by the difficulties in acquiring human choice data. Indeed, experimental economics studies have focused mostly on simple choice settings. The AI community has recently contributed to that effort in two ways: considering whether LLMs can substitute for humans in the above-mentioned simple choice prediction settings, and studying, through an ML lens, more elaborate but still rigorous experimental economics settings that employ incomplete information, repetitive play, and natural language communication, notably language-based persuasion games. This leaves us with a major inspiration: can LLMs be used to fully simulate the economic environment and generate data for efficient human choice prediction, substituting for the elaborate economic lab studies? We pioneer the study of this subject, demonstrating its feasibility. In particular, we show that a model trained solely on LLM-generated data can effectively predict human behavior in a language-based persuasion game, and can even outperform models trained on actual human data.


Prediction of Locally Stationary Data Using Expert Advice

V'yugin, Vladimir, Trunov, Vladimir

arXiv.org Artificial Intelligence

Predicting data coming from a "black box" is one of the main tasks of machine learning. In this case, no stochastic assumptions about the data source are used. The data comes online as a time series consisting of pairs of the form ("signal", "response"). The data source can be an analog, deterministic (algorithmic) or stochastic process. In this case, we will use simple structural assumptions about the source of the data. In this paper, an approach is proposed in which training is performed on small subsamples of the main sample, and the forecasts of the constructed predictive models are combined into one common forecast using known aggregation methods. The general scheme of the online learning process is as follows. The learning process occurs at discrete times in steps t = 1,2,.... At each step t, a local predictive model (expert predictive strategy) is built from a subsample of the data observed in the past and is used to obtain a response to the signal.
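The abstract does not say which aggregation method is used, so the following is only a sketch of one well-known option, the exponentially weighted average forecaster with squared loss. The function name `exp_weights_forecast` and the learning rate `eta` are hypothetical; the expert predictions are taken as given rather than produced by local models fitted on subsamples.

```python
import math


def exp_weights_forecast(expert_preds, outcomes, eta=0.5):
    """Combine expert predictions online with exponential weights (assumed method).

    expert_preds: list of rounds, each a list with one prediction per expert.
    outcomes: list of observed responses, one per round.
    Returns the aggregated forecast made at each round before seeing the outcome.
    """
    n_experts = len(expert_preds[0])
    cum_loss = [0.0] * n_experts
    forecasts = []
    for preds, y in zip(expert_preds, outcomes):
        # Subtract the minimum loss for numerical stability; weights are unchanged
        # after normalization.
        best = min(cum_loss)
        weights = [math.exp(-eta * (loss - best)) for loss in cum_loss]
        total = sum(weights)
        forecasts.append(sum(w * p for w, p in zip(weights, preds)) / total)
        # Reveal the outcome and charge each expert its squared loss.
        cum_loss = [loss + (p - y) ** 2 for loss, p in zip(cum_loss, preds)]
    return forecasts
```

Experts that have predicted well so far receive exponentially larger weight, so the common forecast shifts toward whichever local model currently fits the data.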


Learning to Navigate in a VUCA Environment: Hierarchical Multi-expert Approach

Zhang, Wenqi, Zhao, Kai, Li, Peng, Zhu, Xiao, Ye, Faping, Jiang, Weijie, Fu, Huiqiao, Wang, Tao

arXiv.org Artificial Intelligence

Despite decades of effort, robot navigation in a real scenario with volatility, uncertainty, complexity, and ambiguity (VUCA for short) remains a challenging topic. Inspired by the central nervous system (CNS), we propose a hierarchical multi-expert learning framework for autonomous navigation in a VUCA environment. With a heuristic exploration mechanism considering target location, path cost, and safety level, the upper layer performs simultaneous map exploration and route planning to avoid becoming trapped in a blind alley, similar to the cerebrum in the CNS. Using a local adaptive model fusing multiple discrepant strategies, the lower layer pursues a balance between collision-avoidance and go-straight strategies, acting as the cerebellum in the CNS. We conduct simulation and real-world experiments on multiple platforms, including legged and wheeled robots. Experimental results demonstrate that our algorithm outperforms the existing methods in terms of task achievement, time efficiency, and security.


4 Business Strategies for Implementing Artificial Intelligence

#artificialintelligence

Artificial intelligence (AI) is reinventing industry after industry. In China, AI is tutoring children in more than 1700 schools across 200 cities. In Australia, an AI created a flu vaccine that far outperformed all other existing flu vaccines. In the US, a machine-learning robot is autonomously cooking burgers. There are many incredible real-world implementations of AI.